
caddyhttp, reverseproxy: experimental WebTransport passthrough #7669

Open
tomholford wants to merge 16 commits into caddyserver:master from tomholford:webtransport-reverse-proxy

Conversation


@tomholford tomholford commented Apr 23, 2026

Summary

Revives #5421. Adds experimental WebTransport (draft-ietf-webtrans-http3) reverse-proxy passthrough to reverse_proxy's http transport. HTTP/3 Extended CONNECT with :protocol=webtransport is upgraded client-side, re-dialed upstream, and the two sessions are bridged (bidirectional streams, unidirectional streams, and datagrams in both directions).

Motivation

#5421 documented that Caddy couldn't proxy WebTransport and was closed pending HTTP/3 upstream support (which landed experimentally May 2024 via #5086). w3c/webtransport#525 confirms (per the IETF WG chair) there is no transport-level shortcut — a proxy must understand WT framing. The core team has said community contributions are welcome, and browser support for WebTransport is now broad.

User-facing

Enable WebTransport on the HTTP/3 server with one server-level directive — the same shape as protocols, allow_0rtt, or enable_full_duplex. reverse_proxy auto-detects the WebTransport Extended CONNECT the same way it auto-detects a WebSocket upgrade today; no per-handler config needed.

JSON:

{
  "apps": {
    "http": {
      "servers": {
        "srv0": {
          "listen": [":443"],
          "enable_webtransport": true,
          "routes": [{
            "handle": [{
              "handler": "reverse_proxy",
              "transport": {
                "protocol": "http",
                "versions": ["3"],
                "tls": {}
              },
              "upstreams": [{"dial": "backend:9443"}]
            }]
          }]
        }
      }
    }
  }
}

Caddyfile:

{
    servers {
        enable_webtransport
    }
}

example.com {
    reverse_proxy https://backend:9443 {
        transport http {
            versions 3
        }
    }
}

Implementation

Fifteen atomic commits (ten original + five in response to review); each compiles and passes its own tests:

  • caddyhttp: EnableWebTransport server-level flag (opt-in, EXPERIMENTAL). When true, advertises WebTransport in HTTP/3 SETTINGS, enables EnableDatagrams + EnableStreamResetPartialDelivery on the QUIC listener config, and dispatches each QUIC connection through webtransport.Server.ServeQUICConn. When false, the HTTP/3 path is bit-for-bit identical to pre-WebTransport Caddy (falls back to http3.Server.ServeListener). No runtime cost for HTTP/3 deployments that don't opt in.
  • caddyhttp: UnwrapResponseWriterAs[T] helper — Go's type assertion x.(T) doesn't follow Unwrap() http.ResponseWriter chains, and webtransport-go's Upgrade requires direct-asserting http3.Settingser / http3.HTTPStreamer on the naked writer.
  • caddyhttp (test-only): terminating WebTransport echo handler, registered as http.handlers.webtransport but only in the integration test binary (caddytest/integration/webtransport_echo_test.go, mirroring the mockdns_test.go pattern). Production Caddy builds don't include it.
  • reverseproxy: upstream WT dialer, session pump (six goroutines for bidi/uni streams + datagrams in both directions), and ServeHTTP branch. WT is detected by request shape (:method=CONNECT, :protocol=webtransport) the same way WebSocket upgrades are detected — no per-handler flag.
  • reverseproxy feature parity (3 commits): (1) run the same request-preparation pipeline as the normal proxy path so header_up, X-Forwarded-*, Via, Rewrite, upstream placeholders, dynamic upstreams, countFailure, and {http.reverse_proxy.duration{_ms}} all behave the same. (2) Dial the upstream before upgrading the client so an unreachable upstream surfaces as a 5xx on the client's Dial() instead of a bare post-upgrade session close — and so h.Headers.Response can be applied to the client-visible 200 OK. (3) Track active WT sessions in Host.NumRequests / the in-flight counter so MaxRequests gating, LeastConn/FirstAvailable LB, and the admin /reverse_proxy/upstreams endpoint reflect WT load.
  • Review iteration (5 commits): extracted shared upstream-selection helpers to eliminate duplication between the normal and WT proxy paths; inlined the Protocol const + Writer interface into the reverseproxy package; moved the echo handler out of the production module tree into the integration test binary; introduced the enable_webtransport server flag (collapsing the earlier per-handler surface); added micro-benchmarks confirming the flag-off path is strictly cheaper than flag-on.

Retries and handle_response are intentionally skipped — WT sessions are long-lived and there's no HTTP response body to post-process.
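
For illustration, here is a rough sketch of the flag-gated accept path from the first bullet above, using the names from this PR's commit messages (serveH3AcceptLoop, EnableWebTransport, h3server, wtServer, ServeQUICConn, ServeListener); the exact signature, listener type, and error handling are assumptions, not the code in this branch:

// Sketch only: when the flag is on, hand each QUIC connection to the
// WebTransport server, which forwards non-WT streams to the underlying
// http3 request handler; when off, serve plain HTTP/3 exactly as before.
func (s *Server) serveH3AcceptLoop(ctx context.Context, ln *quic.EarlyListener) error {
	if !s.EnableWebTransport {
		// Flag off: identical to the pre-WebTransport accept path.
		return s.h3server.ServeListener(ln)
	}
	for {
		conn, err := ln.Accept(ctx)
		if err != nil {
			return err
		}
		go func() { _ = s.wtServer.ServeQUICConn(conn) }()
	}
}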

Limitations / design notes (flagging for review)

  • Draft protocol churn. Lands with whatever draft quic-go/webtransport-go supports at build time. Browser updates will require dep bumps. Surface is EXPERIMENTAL.
  • New dep: github.com/quic-go/webtransport-go. Same maintainers as the existing quic-go dependency; uses quic-go/http3 internals we can't practically reimplement. Wrapped behind a Caddy-internal any accessor (Server.WebTransportServer()) so caddyhttp's public API does not name the upstream type, per AGENTS.md.
  • Gated at the server level. enable_webtransport: false (the default) leaves the core HTTP/3 accept path, QUIC config, and SETTINGS advertisement identical to pre-PR Caddy. The webtransport-go dep is still pulled in at build time; keeping it out entirely would require build tags, which isn't how Caddy gates features elsewhere.
  • Close-code propagation client→upstream is best-effort. webtransport-go's Dialer tears down the dedicated QUIC connection immediately after CloseWithError, racing the WT_CLOSE_SESSION capsule. The close itself propagates reliably; only the specific numeric code can be lost. upstream→client direction propagates codes reliably.
  • Stream reset codes are not propagated (the pump uses io.Copy, which handles FIN but not RESET_STREAM + code). App protocols that use reset codes for signaling (e.g. MoQ) would see graceful EOF instead. Follow-up.
  • No retries on a failed upstream dial — a 5xx is returned from the handler so the client's Dial() fails fast. WT sessions are long-lived; retry semantics are app-level. Matches HTTP spec expectations for WebSocket-like upgrades.
  • StreamTimeout does not apply to WT sessions. The normal path uses it to forcibly close streaming requests; for WT it's session-level and semantically different. Documented rather than forced to map. Follow-up if users need it.

Test plan

  • modules/caddyhttp unit tests: WebTransport SETTINGS advertised on the built http3.Server when EnableWebTransport=true, and confirmed absent when false (regression guard for the server-level gate); UnwrapResponseWriterAs traversal through single/multi wrappers + defensive self-reference handling.
  • modules/caddyhttp micro-benchmarks: BenchmarkBuildHTTP3Server_WebTransportOff (~70 ns/op, 392 B/op, 3 allocs/op on Apple M4) vs _On (~144 ns/op, 600 B/op, 6 allocs/op), confirming the flag-off path is strictly cheaper.
  • caddytest/integration echo-handler unit tests: request-shape detection, pass-through for non-WT requests.
  • modules/caddyhttp/reverseproxy unit tests: dialer happy path + header forwarding + bad-address; pump bidi/uni/datagram round-trip, close-code propagation both directions, goroutine-leak sanity check.
  • caddytest/integration: five tests covering the full matrix — (1) real webtransport.Dialer → Caddy H3 → webtransport echo handler asserting bidi echo; (2) real dialer → Caddy proxy → second Caddy upstream, bidi echo through the pump; (3) standalone-upstream test inspecting the forwarded Extended CONNECT, asserting headers.request.set, X-Forwarded-For, and Via; (4) dial against an unbound port, asserting the client sees a 5xx status (not post-upgrade close); (5) active-session test that polls /reverse_proxy/upstreams and asserts num_requests increments during the session and drops to 0 after close.
  • Caddyfile adapt round-trip for the enable_webtransport global server option (covered by the updated global_server_options_single.caddyfiletest).
  • Manual smoke test: 200 concurrent bidi streams, 5.5 KB payloads, clean graceful shutdown.

Assistance Disclosure

Claude collaborated on research, design, implementation, and tests; I reviewed each commit, exercised the code end-to-end against real binaries, and iterated on the design where I disagreed with generated output.


Appendix: manual end-to-end reproduction

The steps below reproduce the WT proxy end-to-end against a real webtransport.Dialer client hitting a real Caddy binary (not the in-process caddytest.Tester). Because the echo handler now lives only in the integration-test binary, the upstream in this topology is a small standalone wt-echo-server rather than a second Caddy instance.

What it verifies

  • The production Caddy binary accepts a live WebTransport session on its HTTP/3 listener when enable_webtransport is on.
  • A real (non-test-harness) WT client can dial the proxy and round-trip bidirectional-stream payloads via the pump to an independent WT server.
  • Concurrent sessions + non-trivial payloads + graceful shutdown all behave.
  • Regression check: with enable_webtransport: false, the core HTTP/3 path is unchanged — WT SETTINGS are not advertised and plain H3 behaves as on master.

Build + config

/tmp/wt-echo-server/main.go — ~70-line standalone WT echo server (used as the upstream)
// Standalone WebTransport echo server for manual e2e verification.
// Used as the upstream behind the Caddy WT proxy since the echo handler
// is no longer shipped in the production Caddy binary.
package main

import (
	"crypto/tls"
	"flag"
	"io"
	"log"
	"net"
	"net/http"

	"github.com/quic-go/quic-go"
	"github.com/quic-go/quic-go/http3"
	"github.com/quic-go/webtransport-go"
)

func main() {
	addr := flag.String("addr", "127.0.0.1:9444", "UDP listen addr")
	cert := flag.String("cert", "", "TLS cert file")
	key := flag.String("key", "", "TLS key file")
	flag.Parse()

	c, err := tls.LoadX509KeyPair(*cert, *key)
	if err != nil {
		log.Fatalf("load cert: %v", err)
	}

	mux := http.NewServeMux()
	h3 := &http3.Server{
		TLSConfig: &tls.Config{Certificates: []tls.Certificate{c}, NextProtos: []string{http3.NextProtoH3}},
		Handler:   mux,
		QUICConfig: &quic.Config{
			EnableDatagrams:                  true,
			EnableStreamResetPartialDelivery: true,
		},
	}
	webtransport.ConfigureHTTP3Server(h3)
	wt := &webtransport.Server{H3: h3}

	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		sess, err := wt.Upgrade(w, r)
		if err != nil {
			log.Printf("upgrade: %v", err)
			w.WriteHeader(http.StatusBadRequest)
			return
		}
		ctx := sess.Context()
		for {
			s, err := sess.AcceptStream(ctx)
			if err != nil {
				return
			}
			go func(st *webtransport.Stream) {
				_, _ = io.Copy(st, st)
				_ = st.Close()
			}(s)
		}
	})

	udpAddr, _ := net.ResolveUDPAddr("udp", *addr)
	conn, err := net.ListenUDP("udp", udpAddr)
	if err != nil {
		log.Fatalf("listen udp: %v", err)
	}
	log.Printf("wt-echo-server listening on %s", *addr)
	if err := wt.Serve(conn); err != nil {
		log.Fatal(err)
	}
}
/tmp/caddy-wt-proxy.json — proxy Caddy on :9443 forwarding to the echo upstream on :9444

Uses the test cert Caddy's own integration tests ship with (caddytest/a.caddy.localhost.crt). Adjust the absolute paths to match your checkout.

{
  "admin": {"listen": "localhost:2999"},
  "apps": {
    "http": {
      "http_port": 9080,
      "https_port": 9443,
      "servers": {
        "proxy": {
          "listen": [":9443"],
          "protocols": ["h3"],
          "enable_webtransport": true,
          "routes": [{"handle": [{
            "handler": "reverse_proxy",
            "transport": {
              "protocol": "http",
              "versions": ["3"],
              "tls": {"insecure_skip_verify": true}
            },
            "upstreams": [{"dial": "127.0.0.1:9444"}]
          }]}],
          "tls_connection_policies": [{
            "certificate_selection": {"any_tag": ["cert0"]},
            "default_sni": "a.caddy.localhost"
          }]
        }
      }
    },
    "tls": {
      "certificates": {
        "load_files": [{
          "certificate": "<path-to-caddy-repo>/caddytest/a.caddy.localhost.crt",
          "key": "<path-to-caddy-repo>/caddytest/a.caddy.localhost.key",
          "tags": ["cert0"]
        }]
      }
    },
    "pki": {"certificate_authorities": {"local": {"install_trust": false}}}
  }
}
/tmp/wt-poke/main.go — 112-line Go WebTransport smoke-test client

Opens N parallel bidirectional streams, writes an indexed payload on each, reads the echo, asserts equality, and reports per-stream timings.

// wt-poke: a tiny WebTransport client for smoke-testing a Caddy WT
// reverse-proxy end-to-end. Dials the given URL, opens N bidirectional
// streams, writes a payload on each, reads the echo, and reports timing.

package main

import (
	"context"
	"crypto/tls"
	"flag"
	"fmt"
	"io"
	"log"
	"os"
	"sync"
	"time"

	"github.com/quic-go/quic-go"
	"github.com/quic-go/quic-go/http3"
	"github.com/quic-go/webtransport-go"
)

func main() {
	url := flag.String("url", "https://localhost:9443/", "WebTransport endpoint")
	payload := flag.String("payload", "hello webtransport via caddy", "bytes to send per stream")
	streams := flag.Int("streams", 3, "parallel bidirectional streams")
	timeout := flag.Duration("timeout", 10*time.Second, "overall deadline")
	flag.Parse()

	ctx, cancel := context.WithTimeout(context.Background(), *timeout)
	defer cancel()

	dialer := &webtransport.Dialer{
		TLSClientConfig: &tls.Config{
			InsecureSkipVerify: true,
			NextProtos:         []string{http3.NextProtoH3},
		},
		QUICConfig: &quic.Config{
			EnableDatagrams:                  true,
			EnableStreamResetPartialDelivery: true,
		},
	}

	dialStart := time.Now()
	rsp, sess, err := dialer.Dial(ctx, *url, nil)
	if err != nil {
		log.Fatalf("dial %s: %v", *url, err)
	}
	defer sess.CloseWithError(0, "")
	fmt.Printf("dialed in %v, status %d\n", time.Since(dialStart), rsp.StatusCode)

	var wg sync.WaitGroup
	var mu sync.Mutex
	failures := 0

	for i := 0; i < *streams; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			start := time.Now()
			str, err := sess.OpenStreamSync(ctx)
			if err != nil {
				mu.Lock(); failures++; mu.Unlock()
				log.Printf("stream %d: open: %v", i, err); return
			}
			msg := fmt.Sprintf("[%d] %s", i, *payload)
			if _, err := io.WriteString(str, msg); err != nil {
				mu.Lock(); failures++; mu.Unlock()
				log.Printf("stream %d: write: %v", i, err); return
			}
			if err := str.Close(); err != nil {
				mu.Lock(); failures++; mu.Unlock()
				log.Printf("stream %d: close: %v", i, err); return
			}
			got, err := io.ReadAll(str)
			if err != nil {
				mu.Lock(); failures++; mu.Unlock()
				log.Printf("stream %d: read: %v", i, err); return
			}
			if string(got) != msg {
				mu.Lock(); failures++; mu.Unlock()
				log.Printf("stream %d: mismatch got=%q want=%q", i, got, msg); return
			}
			fmt.Printf("stream %d OK (%d bytes round-trip in %v)\n", i, len(got), time.Since(start))
		}(i)
	}

	wg.Wait()
	if failures > 0 {
		fmt.Printf("\nDONE with %d failure(s)\n", failures); os.Exit(1)
	}
	fmt.Println("\nOK — end-to-end WebTransport proxying works")
}

Reproduction steps

# 1. Build the Caddy binary from this branch
cd cmd/caddy && go build -o /tmp/caddy . && cd -

# 2. Build the echo upstream and the poke client (each its own tiny module)
mkdir -p /tmp/wt-echo-server && cd /tmp/wt-echo-server
# (paste main.go from above)
go mod init wt-echo-server && go mod tidy && go build -o /tmp/wt-echo-server-bin . && cd -

mkdir -p /tmp/wt-poke && cd /tmp/wt-poke
# (paste main.go from above)
go mod init wt-poke && go mod tidy && go build -o /tmp/wt-poke-bin . && cd -

# 3. Start the echo upstream and Caddy proxy
CERT=<path-to-caddy-repo>/caddytest/a.caddy.localhost.crt
KEY=<path-to-caddy-repo>/caddytest/a.caddy.localhost.key
/tmp/wt-echo-server-bin -addr 127.0.0.1:9444 -cert $CERT -key $KEY > /tmp/wt-echo.log 2>&1 &
/tmp/caddy run --config /tmp/caddy-wt-proxy.json > /tmp/caddy-wt.log 2>&1 &
sleep 1.5

# 4. Smoke tests
/tmp/wt-poke-bin -url https://a.caddy.localhost:9443/ -streams 5
/tmp/wt-poke-bin -url https://a.caddy.localhost:9443/ -streams 50 \
    -payload "$(head -c 4096 /dev/urandom | base64)"
/tmp/wt-poke-bin -url https://a.caddy.localhost:9443/ -streams 200

# 5. Regression check: flip enable_webtransport off and confirm WT fails clearly
sed 's/"enable_webtransport": true,/"enable_webtransport": false,/' /tmp/caddy-wt-proxy.json > /tmp/caddy-wt-proxy-off.json
# restart Caddy against the "off" config, then:
/tmp/wt-poke-bin -url https://a.caddy.localhost:9443/ -streams 1 -timeout 3s
# expected: "server didn't enable HTTP/3 datagram support"

# 6. Tear down
kill %1 %2

Observed result (Apple Silicon, macOS, loopback)

=== 5 streams, 32-byte payload ===
dialed in 13.7 ms, status 200
5/5 streams OK, ~350 µs round-trip each
OK — end-to-end WebTransport proxying works

=== 50 streams, 4 KB random payload ===
dialed in 5.1 ms, status 200
50/50 streams OK, ~4 ms p50
OK — end-to-end WebTransport proxying works

=== 200 streams, 32-byte payload ===
dialed in 4.7 ms, status 200
200/200 streams OK, ~4 ms p90
OK — end-to-end WebTransport proxying works

=== Regression (enable_webtransport: false) ===
dial https://a.caddy.localhost:9443/: server didn't enable HTTP/3 datagram support
(confirms WT SETTINGS are not advertised; core HTTP/3 path is unchanged)

Clean SIGTERM on both processes, no goroutine leaks, no port contention on tear-down.

Add github.com/quic-go/webtransport-go dep (built on quic-go, same
maintainers) and call webtransport.ConfigureHTTP3Server on the
http3.Server. This advertises WebTransport enablement in SETTINGS,
enables HTTP/3 DATAGRAMs, and stashes the *quic.Conn in each request's
context — a prerequisite for a later WebTransport-aware handler or
reverse-proxy transport to call webtransport.Server.Upgrade. Also
enable QUIC stream reset partial delivery, required by webtransport-go.

No user-visible behavior change: clients that don't speak WebTransport
ignore the extra SETTINGS, and no handler yet calls Upgrade.

Extract the http3.Server construction into buildHTTP3Server so the
SETTINGS assertions can be unit-tested without a live UDP listener.

Go's type assertion `x.(T)` does not follow Unwrap() http.ResponseWriter
chains. Caddy wraps the writer multiple times (logging recorder,
intercept, encode, etc.), so code that needs interfaces implemented only
by the raw writer owned by the HTTP server — for example the
http3.Settingser/HTTPStreamer interfaces that webtransport.Server.Upgrade
type-asserts — cannot see through those wrappers.

UnwrapResponseWriterAs walks the Unwrap() chain and returns the first
writer that satisfies the requested interface (or the zero value if none
do). Mirrors the traversal http.ResponseController performs internally.

Used by upcoming WebTransport handler and reverse-proxy transport.
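
A sketch of that traversal (it matches the behavior described above; the exact form of the self-reference guard is an assumption):

func UnwrapResponseWriterAs[T any](w http.ResponseWriter) (T, bool) {
	var zero T
	for w != nil {
		if t, ok := w.(T); ok {
			return t, true
		}
		u, ok := w.(interface{ Unwrap() http.ResponseWriter })
		if !ok {
			return zero, false
		}
		inner := u.Unwrap()
		if inner == w {
			// Defensive: a wrapper that unwraps to itself would loop forever.
			return zero, false
		}
		w = inner
	}
	return zero, false
}
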
Introduces http.handlers.webtransport, an EXPERIMENTAL handler that
terminates an incoming WebTransport session on top of Caddy's HTTP/3
server and echoes bytes on each bidirectional stream. Primary use case
is as a test upstream for the forthcoming WebTransport reverse-proxy
transport; it also serves as the minimal proof that the server-side
WebTransport wiring works end-to-end.

Plumbing changes:

  * caddyhttp.Server gains a *webtransport.Server field alongside
    h3server. It's built in buildWebTransportServer(), wrapping the
    existing http3.Server. Exposed via WebTransportServer() any on the
    Server, so the caddyhttp public API does not name the upstream
    webtransport-go type (per AGENTS.md).

  * serveHTTP3 now runs a custom accept loop (serveH3AcceptLoop) that
    dispatches each accepted QUIC connection to
    webtransport.Server.ServeQUICConn instead of
    http3.Server.ServeListener. The WebTransport server transparently
    forwards non-WT streams to the underlying http3 request handler
    (cost: one varint peek per stream), so behavior for non-WT clients
    is unchanged.

  * ListenQUIC enables EnableDatagrams and
    EnableStreamResetPartialDelivery on the QUIC listener config.
    These are capability bits negotiated during the QUIC handshake and
    are prerequisites for any WebTransport session; they do not force
    usage so non-WT H3 traffic is unaffected.

  * Stop path closes wtServer after h3server.Shutdown to clean up any
    remaining WebTransport session state.

The handler uses caddyhttp.UnwrapResponseWriterAs to reach the naked
http3.Settingser/HTTPStreamer writer through Caddy's wrapping chain
before calling webtransport.Server.Upgrade.

Includes unit tests for request-shape detection plus an integration
test (caddytest/integration/webtransport_test.go) that spins up a
Caddy HTTP/3 server with the handler, dials it with a real
webtransport.Dialer, and asserts end-to-end bidirectional-stream
echo.

dialUpstreamWebTransport is a thin wrapper around webtransport.Dialer.Dial
that sets the QUIC config flags WebTransport requires (EnableDatagrams,
EnableStreamResetPartialDelivery) and forwards request headers on the
Extended CONNECT. Intended as an internal building block for the
upcoming WebTransport reverse-proxy transport; not yet wired into
ServeHTTP.

Unit-tested against an in-process webtransport.Server with a freshly
minted self-signed certificate. Covers: successful dial, header
forwarding, and connection-refused against an unbound port.
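
A minimal sketch of that helper, assuming the same quic-go/webtransport-go imports as the appendix wt-poke client above; the exact signature, URL handling, and error wording are assumptions:

func dialUpstreamWebTransport(ctx context.Context, upstreamURL string, hdr http.Header, tlsCfg *tls.Config) (*webtransport.Session, error) {
	d := &webtransport.Dialer{
		TLSClientConfig: tlsCfg,
		QUICConfig: &quic.Config{
			EnableDatagrams:                  true, // required for WT datagrams
			EnableStreamResetPartialDelivery: true, // required by webtransport-go
		},
	}
	// Forward the original request headers on the Extended CONNECT.
	rsp, sess, err := d.Dial(ctx, upstreamURL, hdr)
	if err != nil {
		return nil, err
	}
	if rsp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("upstream rejected WebTransport CONNECT: status %d", rsp.StatusCode)
	}
	return sess, nil
}
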
runWebTransportPump bridges two WebTransport sessions so every
bidirectional stream, unidirectional stream, and datagram opened on one
side is mirrored on the other. Uses six goroutines (bidi both ways, uni
both ways, datagrams both ways) and blocks until both sessions end.

Close propagation: when either session ends, the peer is closed via
CloseWithError. The code/message are read from the closing session's
stored close state (by probing AcceptStream with a short timeout),
since Receive{Datagram,UniStream} return the underlying stream error
rather than the SessionError and can win the propagation race. Close
propagation is best-effort for client-initiated close through a
Dialer-dedicated QUIC conn: webtransport-go tears down the QUIC
connection immediately after CloseWithError, so the upstream may
observe a QUIC ApplicationError before the WT_CLOSE_SESSION capsule is
parsed. The pump still closes the peer session; only the specific
error code may not survive.

Not yet wired into ServeHTTP.

Tests: topology of client -> frontend -> upstream where frontend runs
the pump. Exercises bidi both ways, uni client-to-upstream, datagram
round-trip, CloseWithError propagation both ways, and a basic
goroutine-leak check.
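
The shape of the pump, as a sketch: one splice of three accept loops per direction, applied both ways. Close-code propagation, logging, and the leak checks described above are omitted, and the names here are illustrative rather than the code in this branch:

func pumpSessions(ctx context.Context, client, upstream *webtransport.Session) {
	var wg sync.WaitGroup
	splice := func(src, dst *webtransport.Session) {
		// Bidirectional streams: accept on src, open the mirror on dst,
		// then copy both directions.
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				s, err := src.AcceptStream(ctx)
				if err != nil {
					return
				}
				d, err := dst.OpenStreamSync(ctx)
				if err != nil {
					return
				}
				go func() { _, _ = io.Copy(d, s); _ = d.Close() }()
				go func() { _, _ = io.Copy(s, d); _ = s.Close() }()
			}
		}()
		// Unidirectional streams, src to dst only.
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				s, err := src.AcceptUniStream(ctx)
				if err != nil {
					return
				}
				d, err := dst.OpenUniStreamSync(ctx)
				if err != nil {
					return
				}
				go func() { _, _ = io.Copy(d, s); _ = d.Close() }()
			}
		}()
		// Datagrams, src to dst only.
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				msg, err := src.ReceiveDatagram(ctx)
				if err != nil {
					return
				}
				_ = dst.SendDatagram(msg)
			}
		}()
	}
	splice(client, upstream)
	splice(upstream, client)
	wg.Wait() // six session-level goroutines total, three per direction
}
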
Extends the http reverse-proxy transport with a webtransport boolean
that opts the upstream into WebTransport passthrough. Must be combined
with versions: ["3"]; WebTransport rides on HTTP/3 exclusively.

When enabled, Handler.ServeHTTP detects Extended CONNECT with
:protocol=webtransport early — before any of the normal round-trip
machinery — and branches to serveWebTransport, which:

  1. Pulls the *webtransport.Server off caddyhttp.Server (via
     WebTransportServer()) and errors out cleanly if HTTP/3 isn't
     enabled on the frontend.
  2. Picks a single upstream through the configured load-balancer.
     No retries: a failed dial closes the client session and returns.
  3. Walks the response-writer Unwrap() chain to reach the raw http3
     writer and calls webtransport.Server.Upgrade to terminate the
     incoming session.
  4. Uses dialUpstreamWebTransport to open a session to the selected
     upstream, forwarding request headers on the Extended CONNECT.
  5. Runs runWebTransportPump between the two sessions and blocks
     until both close.

The transport's wtTLSConfig is built at Provision time from the
existing TLS config (same path h3Transport already uses) and reused
for every session.

Tests: adds TestWebTransport_ReverseProxyEndToEnd which spins up a
single Caddy instance with two HTTP/3 servers — one proxy on :9443,
one terminating echo upstream on :9444 — and drives a real
webtransport.Dialer through the proxy to assert end-to-end
bidirectional-stream echo.

Adds a `webtransport` subdirective to the `transport http {}` block of
reverse_proxy that sets the new WebTransport bool on the transport.
Takes no arguments; exclusivity with versions 3 is enforced at
Provision time so parse order doesn't matter.

Example:
    reverse_proxy https://backend:9443 {
        transport http {
            versions 3
            webtransport
            tls_insecure_skip_verify
        }
    }

Includes a Caddyfile-to-JSON adapt test round-tripping the new
subdirective.

The WebTransport proxy path previously bypassed the request-preparation
pipeline that normal reverse-proxy traffic runs through. Reuse it so
`header_up`, `X-Forwarded-For`/`Host`/`Proto`, `Via`, `Rewrite`, the
`{http.reverse_proxy.upstream.*}` placeholders, dynamic upstreams,
`countFailure`, and the `{http.reverse_proxy.duration{_ms}}` timing
placeholder all behave the same as on the regular path.

Retries, `handle_response`, and response-header ops are intentionally
not run here — a WebTransport session has no HTTP response body to
post-process and is not idempotent. Integration test exercises the
header-forwarding contract end-to-end through a standalone (non-Caddy)
WebTransport upstream so the forwarded Extended CONNECT can be
inspected.

Reorder serveWebTransport so the upstream is dialed first. If the
upstream is unreachable or refuses the CONNECT, a proper 5xx is returned
from the handler — the client's Dial() surfaces the real status instead
of a successful upgrade followed by an opaque session close.

Also apply `h.Headers.Response` (gated by `Require`, if configured)
against the upstream response status/headers; the ops run on the
client-visible response headers, which webtransport.Server.Upgrade
flushes with the 200 OK. If the client-side upgrade fails after the
upstream dial succeeded, close the upstream session cleanly.

Integration test drives a dial to an unbound loopback port and asserts
the client sees a 5xx status instead of a bare session close.

Bracket the pump's lifetime with Host.countRequest(±1) and
incInFlightRequest/decInFlightRequest so WT sessions participate in the
same accounting as the normal proxy path:

- MaxRequests gating (Upstream.Full) now blocks WT sessions past the
  cap, instead of silently failing open.
- LeastConn / FirstAvailable selection sees WT load, instead of seeing
  busy upstreams as idle.
- Admin /reverse_proxy/upstreams reports WT sessions under num_requests.

Integration test holds an upstream session open via a standalone WT
server, polls the admin API to assert num_requests increments during
the session and drops back to 0 after close.

CLAassistant commented Apr 23, 2026

CLA assistant check
All committers have signed the CLA.

@tomholford tomholford marked this pull request as ready for review April 23, 2026 03:16
@tomholford
Author

@marten-seemann This wouldn't have been possible without your hard work on webtransport-go. If you have the bandwidth, I'd appreciate your feedback.


func (p *webtransportPump) run() {
var wg sync.WaitGroup
wg.Add(6)
Member


That's a lot of goroutines to spin up for a single request. Do we really need 6? That seems wild, is it not possible to just have two?

Author


These six are per-session, not per-stream, and fall out of the webtransport-go API shape. The library exposes three independent blocking accept methods — Session.AcceptStream (bidi), Session.AcceptUniStream (uni), and Session.ReceiveDatagram — each of which must be driven from its own goroutine because Go has no non-blocking variant to select over. Streams and datagrams can originate from either peer, so each direction needs its own accept loop: 3 × 2 = 6.

A WT session that serves thousands of streams still has only these 6 session-level goroutines, plus the per-stream splices (2 per active bidi stream, 1 per active uni stream).

I tried collapsing the pairs into selects fed by smaller goroutines, but that pushes the same six goroutines one level down for no runtime saving and more plumbing. Happy to revisit if webtransport-go upstream adds a unified accept API.

Member

@francislavoie francislavoie Apr 24, 2026


So you're saying for each reverse_proxy instance there will only be 6 goroutines? or 6 per upstream? instead of 6 per request I think. I'm not sure what a "session" means in this context.

If that's the case, that seems pretty reasonable. I just wanted to make sure goroutine count is kept in check, because it can mean a lot of memory buildup if not managed well (as we're trying to improve in #7649).

Author


Good question. To clarify:

6 per active WebTransport session. Not per reverse_proxy instance, upstream, nor request.

"Session" is the IETF's own term (see draft-ietf-webtrans-http3 §3): it's established when the server sends a 2xx to an Extended CONNECT with :protocol=webtransport, and it's the single long-lived handle over which any number of bidirectional streams, unidirectional streams, and datagrams flow in either direction.

Member

@francislavoie francislavoie Apr 25, 2026


So it's 6 goroutines per connected client? That's still a lot, effectively 3x as heavy as WebSockets. Seems strange. (Just trying to understand the practical semantics.) I would still call that a "request"; I realize it's looser because of UDP, but it's the same idea, I think: each client would almost always do just one request to set up a WebSocket, just as it would do just one to set up the WebTransport session, no?

Comment thread modules/caddyhttp/reverseproxy/webtransport_transport.go Outdated
Comment thread modules/caddyhttp/standard/imports.go Outdated
Comment thread modules/caddyhttp/reverseproxy/webtransport_transport.go Outdated
@francislavoie francislavoie added this to the 2.x milestone Apr 23, 2026
@francislavoie francislavoie added the feature ⚙️ New feature or request label Apr 23, 2026
@steadytao
Member

steadytao commented Apr 23, 2026

Well this is a bit scary from a blast-radius perspective.

Francis seems to be covering the implementation and maintenance issues well already; my separate concern is the architectural one. This appears to put experimental WebTransport handling directly into Caddy's core HTTP/3 accept path, not just behind an opt-in reverse-proxy feature.

That makes me think this wants splitting or at least a much stronger justification for why the core H3-path change is low risk. Right now it looks like an experimental feature is being paid for by every HTTP/3 deployment, not only the ones that opt into WebTransport.

@tomholford
Author

Thank you for the feedback. Will review and address.

The WebTransport proxy path in serveWebTransport duplicated the
dynamic-upstream-fallback block and the {http.reverse_proxy.upstream.*}
replacer-variable block from proxyLoopIteration. Francis flagged this
as a maintenance burden in review of caddyserver#7669.

Extract two helpers:

  * resolveUpstreams(r) returns the candidate upstream set — dynamic
    when configured (with provisioning + fallback-on-error), static
    otherwise. Caller runs the LB selection policy, since the two call
    sites diverge on how selection failure is reported (retry loop vs.
    fast 502 for long-lived WT sessions).

  * setUpstreamReplacerVars(repl, up, di) publishes the seven
    placeholders describing the selected upstream.

Both are used by proxyLoopIteration and serveWebTransport with
identical semantics to the inlined code they replace. No behavior
change for either path.

Francis pointed out in review of caddyserver#7669 that importing the whole
modules/caddyhttp/webtransport package solely to pull in one constant
and one interface wasn't worthwhile.

Move both into webtransport_transport.go as unexported identifiers
(webtransportProtocol, webtransportWriter). This removes reverseproxy's
dependency on the caddywt package and clears the way for moving the
echo handler itself out of the production module tree.

No behavior change.

Francis pointed out in review of caddyserver#7669 that the echo handler — which
exists solely as a test upstream for the WebTransport reverse-proxy
tests — should not be a full-fledged module registered in every Caddy
binary. Mirroring the mockdns_test.go pattern, move it into a _test.go
file under caddytest/integration/.

The module ID http.handlers.webtransport is now registered only when
the integration test binary is built, which is when
caddytest/integration/webtransport_test.go references it by ID string
in its JSON configs. Production Caddy builds no longer include it.

Changes:
  * New file: caddytest/integration/webtransport_echo_test.go —
    contains the WebTransportEcho handler, its types and interface
    guards, the isWebTransportEchoUpgrade helper, and the unit tests
    that used to live in the deleted package's handler_test.go.
  * Deleted: modules/caddyhttp/webtransport/ (handler.go + handler_test.go).
  * Removed the blank import from modules/caddyhttp/standard/imports.go.

The Protocol const and Writer interface that this package used to
export were inlined into reverseproxy's own files in a preceding
commit, so nothing else depends on the deleted package.

caddyhttp, reverseproxy: gate WebTransport behind enable_webtransport server flag

steadytao raised an architectural concern in review of caddyserver#7669: the PR
put experimental WebTransport handling directly into Caddy's core
HTTP/3 accept path, so every HTTP/3 deployment paid for the feature
whether or not they used it.

Collapse the enablement surface to a single server-level opt-in that
matches Caddy's existing precedent for protocol-level features
(`protocols`, `allow_0rtt`, `enable_full_duplex`), and detect the
request shape at the handler the same way `reverse_proxy` detects a
WebSocket upgrade today — no per-handler config flag.

Core HTTP/3 path changes (modules/caddyhttp/server.go):
  * New `EnableWebTransport bool` field on Server, marked EXPERIMENTAL.
  * buildHTTP3Server now only calls webtransport.ConfigureHTTP3Server
    and sets EnableStreamResetPartialDelivery when the flag is true.
    When false, the constructed http3.Server is bit-for-bit identical
    to the pre-WebTransport implementation.
  * wtServer is constructed only when the flag is true.
  * serveH3AcceptLoop falls back to http3.Server.ServeListener when
    the flag is false — no varint peek, no per-connection dispatch.

Caddyfile wiring (caddyconfig/httpcaddyfile/serveroptions.go):
  * New `enable_webtransport` global server option, modeled on
    `enable_full_duplex`.

Reverse-proxy simplifications (modules/caddyhttp/reverseproxy/):
  * Removed HTTPTransport.WebTransport field and its Provision-time
    exclusivity check (no longer needed; H3 is validated separately).
  * Removed the `webtransport` Caddyfile subdirective under
    `transport http { }` — this neutralizes the prior commit that
    introduced it.
  * Removed Handler.webtransportEnabled cache. ServeHTTP now branches
    on isWebTransportExtendedConnect(r) alone, matching how the
    WebSocket upgrade branch works.
  * serveWebTransport gains fail-fast guards with clear errors when
    the parent server has enable_webtransport=false or when the
    handler's transport does not include HTTP/3.

Tests:
  * Existing TestServer_BuildHTTP3ServerEnablesWebTransport now sets
    EnableWebTransport=true explicitly; new
    TestServer_BuildHTTP3ServerWithoutWebTransport locks in the
    regression guard that flag-off produces the pre-PR http3.Server.
  * Integration tests updated: enable_webtransport: true added to
    every H3 server block; "webtransport": true dropped from the
    reverse_proxy transport JSON (auto-detected now).
  * Caddyfile adapt test for the deleted `webtransport` subdirective
    is removed; `enable_webtransport` is added to the existing
    global_server_options_single adapt test alongside its peers.

No runtime behavior change when enable_webtransport is false. Diff
against master on the core HTTP/3 path is effectively zero in that
configuration.

…bTransport

Adds a hermetic pair of benchmarks on buildHTTP3Server to provide
quantitative evidence for the claim that deployments with
enable_webtransport=false pay no cost for the WebTransport feature.

Results on Apple M4, go1.25, -count=5:

  BenchmarkBuildHTTP3Server_WebTransportOff   ~70 ns/op   392 B/op   3 allocs/op
  BenchmarkBuildHTTP3Server_WebTransportOn   ~144 ns/op   600 B/op   6 allocs/op

The Off path is about half the cost on every dimension, confirming
that the work skipped when the flag is false is the
webtransport.ConfigureHTTP3Server call plus EnableStreamResetPartialDelivery.
Absolute cost is a one-time per-server setup so either branch is
negligible in practice, but the asymmetry locks in a regression guard:
a future refactor that accidentally re-enables the WT configuration
unconditionally would show up as a jump in the Off numbers.

This benchmark does not exercise the per-stream dispatch cost inside
webtransport.Server.ServeQUICConn — that would require a full QUIC
setup to measure in isolation and is follow-up work.
@tomholford
Author

tomholford commented Apr 23, 2026

@steadytao Thanks for flagging; I've pushed commits that fully address this:

bb8b3ee (caddyhttp, reverseproxy: gate WebTransport behind enable_webtransport server flag) collapses the enablement surface to a single server-level opt-in that follows the same pattern as protocols, allow_0rtt, and enable_full_duplex. When off — the default — buildHTTP3Server skips webtransport.ConfigureHTTP3Server, skips EnableStreamResetPartialDelivery, and the accept loop falls back to http3.Server.ServeListener unchanged. wtServer is never constructed. Runtime diff against master on the core HTTP/3 path is effectively zero when the flag is off.

Same commit also drops the per-handler transport http { webtransport } directive that was in the earlier revision. WebTransport Extended CONNECT is now detected at the handler the same way a WebSocket upgrade is detected today: by request shape instead of a config flag. Deployments that want WT proxying just do:

{
    servers {
        enable_webtransport
    }
}

example.com {
    reverse_proxy https://backend:9443 {
        transport http {
            versions 3
        }
    }
}

d67425d adds a pair of hermetic benchmarks to quantify the delta:

goos: darwin
goarch: arm64
cpu: Apple M4
BenchmarkBuildHTTP3Server_WebTransportOff-10    ~70 ns/op   392 B/op   3 allocs/op
BenchmarkBuildHTTP3Server_WebTransportOn-10    ~144 ns/op   600 B/op   6 allocs/op

Off is about half the cost on every dimension, which is exactly the work skipped when the flag is false (ConfigureHTTP3Server + EnableStreamResetPartialDelivery). Absolute cost is a one-time per-server setup so either branch is negligible in practice, but the asymmetry locks in a regression guard: a future refactor that accidentally re-enables WT configuration unconditionally would show up as a jump in the Off numbers. A per-stream dispatch benchmark inside webtransport.Server.ServeQUICConn would require a full QUIC setup to measure in isolation — left as follow-up.

The webtransport-go dep is still pulled in at build time. Keeping it out entirely would require build tags, which isn't how Caddy gates features anywhere else from what I can see; but happy to revisit if that's a blocker.

@tomholford
Author

@francislavoie @steadytao ready for re-review when you have time. Summary of what landed since the last pass:

  • Francis #1 (6 goroutines): responded inline — per-session cost driven by webtransport-go's three independent blocking accept APIs × two directions
  • Francis #2 (duplicated proxy setup): 8a2b853b — extracted resolveUpstreams + setUpstreamReplacerVars, both call sites use them
  • Francis #3 (echo handler as a standard module): 15ca6714 — moved to caddytest/integration/webtransport_echo_test.go following the mockdns_test.go pattern; deleted the production package; dropped from standard imports
  • Francis #4 (caddywt import for one const + one type): 4325098c — inlined both into webtransport_transport.go
  • steadytao (core H3 blast radius): bb8b3eee — collapsed enablement to a single server-level enable_webtransport flag matching the enable_full_duplex / allow_0rtt / protocols precedent. When off (default), the H3 path is bit-for-bit identical to pre-PR. Also dropped the per-handler transport http { webtransport } directive — WT is now auto-detected at the handler the same way WebSocket upgrades are detected. d67425d4 adds a micro-benchmark confirming flag-off is strictly cheaper (~70 ns off vs ~144 ns on per H3 server construction).
  • PR body + appendix updated to reflect the new surface; manual reproduction re-verified end-to-end against the built binary (5 / 50 / 200 concurrent streams, clean SIGTERM) plus a regression check with the flag off confirming WT SETTINGS are not advertised.

@steadytao
Member

I will allow Francis to continue with the review. But from a brief check-over, it seems much better, thank you!


The WT path duplicated upstream resolution, LB selection, header ops,
replacer vars, and in-flight counters. Route WT through the shared
ServeHTTP -> proxyLoopIteration -> reverseProxy flow and swap RoundTrip
for a small webTransportHijack that only does WT-specific work (writer
unwrap, upstream dial, client upgrade, pump).

Rename roundtripSucceededError -> terminalError. The existing name
described when it was emitted (after a successful round-trip); the
new name describes its contract with the retry loop (stop looping,
propagate error unchanged). The WebTransport upgrade case is a second
natural caller for that same signal.

Comes with two behavior improvements that fall out of the collapse:
  - WT upstream dial failures now surface as DialError, so the loop
    can fail over across upstreams like normal proxies (today: 502).
  - Passive health checks apply to WT dials (dial-failure countFailure
    and UnhealthyLatency on dial duration) via the shared path.

Addresses reviewer feedback that the duplicated setup was a
maintenance risk.
Comment thread modules/caddyhttp/reverseproxy/webtransport_transport.go
Comment thread modules/caddyhttp/reverseproxy/webtransport_transport.go
// A WT CONNECT reached this handler because the parent server has
// enable_webtransport=true. But the handler's transport still has to
// speak HTTP/3 to dial the WT upstream.
ht, ok := h.Transport.(*HTTPTransport)
Member


Better to create an interface and have HTTPTransport implement it, instead of using a type assertion to the concrete type.
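
For example, a minimal sketch of that suggestion (the interface and method names here are placeholders, not code from the PR):

// Capability interface asserted by the handler instead of the concrete type.
type webTransportDialer interface {
	dialWebTransport(ctx context.Context, upstreamURL string, hdr http.Header) (*webtransport.Session, error)
}

// ...then in serveWebTransport:
wtd, ok := h.Transport.(webTransportDialer)
if !ok {
	return caddyhttp.Error(http.StatusBadGateway, fmt.Errorf("transport does not support WebTransport"))
}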

// own type assertions and cannot see past a wrapper.
func UnwrapResponseWriterAs[T any](w http.ResponseWriter) (T, bool) {
var zero T
for w != nil {
Member


The non-nil check seems unnecessary, and the same goes for the equality check below for Unwrap.

return copyMap
}

// resolveUpstreams returns the candidate upstream set for this request:
Member


Why are these functions created when they are not used elsewhere?

I don't think reducing the lines of the original function will affect its readability; both are simple enough with comments.

// WebTransportServer returns the server's underlying WebTransport
// serving state as an opaque value. Modules that import
// github.com/quic-go/webtransport-go may type-assert it to
// *webtransport.Server. Returns nil if HTTP/3 is not in use.
Member


This should say: nil if WebTransport is not enabled.

// EXPERIMENTAL: this helper is an internal building block for the
// WebTransport reverse-proxy transport and may change.
func runWebTransportPump(clientSess, upstreamSess *webtransport.Session, logger *zap.Logger) {
if logger == nil {
Member


Remove this nil check. In both testing and production a non-nil logger is passed; creating a new logger when nil is passed may lead to problems later on.

var wg sync.WaitGroup
wg.Add(6)

// Bidirectional streams in both directions.
Member


Use wg.Go; it was introduced in Go 1.25, which is the minimum Go version Caddy supports.
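
For illustration, a sketch of the suggested change; sync.WaitGroup.Go adds one to the counter, runs the function in a new goroutine, and calls Done when it returns (the pump function names below are placeholders):

var wg sync.WaitGroup
wg.Go(func() { copyBidi(client, upstream) })
wg.Go(func() { copyBidi(upstream, client) })
wg.Go(func() { copyUni(client, upstream) })
wg.Go(func() { copyUni(upstream, client) })
wg.Go(func() { copyDatagrams(client, upstream) })
wg.Go(func() { copyDatagrams(upstream, client) })
wg.Wait()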
